-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
WIP/ERR: raise consistent errors for non-existent files #30264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pytest.mark.xfail( | ||
reason=( | ||
"Needs to distinguish between a path to read " | ||
"vs a string to parse" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WillAyd any thoughts on this? I think I saw a semi-recent discussion about trying to guess whether a string is supposed to be a path or actual content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea this was a discussion with @datapythonista and @gfyoung . I don't have a strong point of view in any direction but would be against jumping through a lot of hoops to disambiguate.
I guess if we cared to disambiguate could probably just look for [
, {
, "
or any leading number (ignoring whitespace) to start as those are the requirements for JSON
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave the JSON disambiguation alone for now. I'm still not really that inclined to change at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking through old issues, there are a lot that revolve around disambiguating paths vs content, e.g. #5924 for pickle. (one clever suggestion was to have pd.read_pickles
, pd.read_jsons
, etc as analogues to the stdlib json.loads
)
This doesn't need to be resolved for this PR, but I think we're going to need to have a decision about whether/when/how to guess if a string is a path or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have to be really careful about this
as this basically reinventing the wheel and this raises lots of security concerns
If there's an existing wheel we can use, I'm all for that. Going through the issues, I'm going to try to round up all of the "should we accept strings too?" etc issues so we can at least discuss them all at once. Also happy to break parts of this off if anything can be identified as "easy" |
Closing, will re-implement in smaller pieces. |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
xref #29102 the json case is xfailed here.